Social susceptibility is defined and analyzed using data from the CNN news website. Current
models of opinion dynamics, voting, and herding in closed communities are extended, and the
community's response to the injection of a group with predetermined and permanent opinions is
calculated. A method is developed to estimate the possible response of Internet communities that
follow a specific developing subject. The level of social influence in a community follows from
the statistics of responses ("like" and "dislike" votes) to the comments written by members of
the same community. Three real cases of developing news stories are analyzed. We suggest that
Internet comments may predict the level of social response, much as a barometer predicts the
intensity of a coming storm in a still calm environment.



In recent years, governments throughout the Arab world have been overthrown by
uprisings that followed the self-immolation of a single person, Mohamed Bouazizi.
Similarly, the Occupy Wall Street protest movement was triggered by a single call
to action via a social network. Such cases raise an important question: How can
an individual possessing no special reputation or authority mobilize an entire
community by a single call to stand and fight, while large and professionally
organized companies may remain unnoticed? Answering this question will help
estimate the appropriate timing and the required size for an initial group to
evoke a large-scale social response.

A clear and strong display of personal opinions affects the decision-making
processes of others. This phenomenon of social influence may be either positive
or negative. Positive social influence facilitates the correlated behavior called
herding [1]. Herding contributes significantly to the formation of market prices,
the results of artificial market experiments, traffic flows [3], voting outcomes,
and the dynamics of social networks [12].

Acute herding phenomena, such as social revolutions or financial crises, are
extremely difficult to predict, though they are evident when they occur [13]. A
parameter, analogous to temperature in phase transitions, is required to estimate
the stability of a community's opinion, i.e., the potential of a small
perturbation to culminate in abrupt changes in opinion dynamics. Therefore, to
understand the population dynamics prior to a possible transition, it is
important to develop a quantitative analysis of herding as a function of time.

Internet communities are of special interest for the analysis of the herding
phenomenon. Individual opinions are widely exposed in the binary form of "like"
and "dislike" votes ("likes" and "dislikes") on Internet news websites and social
networks. The data span any important event and expose millions of opinions [14].
Simultaneous analysis of a developing news story and the corresponding herding in
relevant Internet communities may provide a unique opportunity to study the
opinion dynamics in a population as it approaches a critical point and becomes
unstable. To the best of our knowledge, the definition and evaluation of the
temporal dynamics of the herding phenomenon in Internet communities remains a
challenge.

In this Article, we estimate social influence as a function of time in the
Internet communities that followed three news stories reported on the CNN
website: the Zimmerman trial, the Iran nuclear negotiations, and the US
government shutdown of 2013. We show continuous herding dynamics in all three
cases and a significant amplification of social influence near the verdict
announcement in the Zimmerman case. The method we propose allows quantitative
estimation of a community's response to the injection of a group of
non-responsive individuals with a predefined opinion. This quantitative analysis
is possible due to our novel approach, which treats herding through the
conditional probabilities to agree or disagree with other people's opinions.
This approach differs from the generally accepted treatment of herding as a
topology of the network of social interactions [1].

To estimate social susceptibility, we use a specific type of Internet news
discussion. Some Internet news websites provide a commentary section where
readers can comment and vote on (i.e., like or dislike) other readers' comments
(see Fig. 1). A reader can usually vote for any number of comments, with the
restriction of one vote per comment. These data constitute a natural large-scale
social experiment in which the population responds to some external signal
(i.e., a comment). A comment, however, is not completely external, but rather is
created by a community member who responds to the comments of other community
members. Consequently, the statistics of Internet comments and responses can be
used as a measure of the mean-field opinion dynamics of the corresponding
community.

FIG. 1: Internet news and social influence. Consider articles that follow some
developing news story. In contrast to printed newspapers, Internet news websites
open some articles for commentary by the public and for like or dislike votes on
each comment. Comments, together with their likes and dislikes, come from both
supporters and opponents of the articles' statements. Quantitative data on likes
($\uparrow$) and dislikes ($\downarrow$) reveal the conditional probabilities of
individual community members to respond positively or negatively to others'
opinions. These conditional probabilities reflect the level of social influence
in the community, and following articles on the same subject from different
dates allows the temporal dependence of that level to be monitored.

FIG. 2: The social susceptibility $\chi_s$ as a function of the herding
parameter $I$. Social susceptibility is a measure of how many individuals follow
a single one who changes his or her opinion. Thus $\chi_s \gg 1$ ($I \sim 1$)
makes possible significant social transitions that are initiated by a small
group. Social influence vanishes if $\chi_s \sim 0$. The case $\chi_s < 0$
corresponds to populations with negative (antagonistic) social influence.

Consider a large population of $N$ individuals who are debating a subject $S$
and continuously voting in favor of $S$ (up, $\uparrow$) or against it (down,
$\downarrow$). The debate process implies that individuals may change their vote
in time. In our model, the interaction between individual $i$ and any other
randomly selected individual $j$ is expressed by the fact that the probability
per contact of individual $i$ to vote down ($P_\downarrow^{i,j}$) depends on the
vote of individual $j$. This conditional probability is given by [15]:

$$P_\downarrow^{i,j} = \begin{cases} \alpha_{ij} & \text{if } s_j = 1 \\ \beta_{ij} & \text{if } s_j = 0 \end{cases} = \alpha_{ij} s_j + \beta_{ij}(1 - s_j), \tag{1}$$

where $s_j$ is the vote of individual $j$ ($s_j = 1$ for an up vote and
$s_j = 0$ for a down vote) and the parameter $\alpha_{ij}$ ($\beta_{ij}$) is the
probability per contact of individual $i$ voting down given that individual $j$
is voting up (down), regardless of the vote of individual $i$ prior to the
interaction with individual $j$.

In a well-mixed homogeneous population, where the number of contacts per
individual is $k_i = N$ and $(\alpha_{ij}, \beta_{ij}) = (\alpha, \beta)$, the
probability of an individual to vote down is

$$P_\downarrow^{i} = \frac{1}{k_i}\sum_{j=1}^{k_i} P_\downarrow^{i,j} = \frac{1}{N}\sum_{j=1}^{N}\left[\alpha s_j + \beta(1 - s_j)\right]. \tag{2}$$

Defining $\gamma = \frac{1}{N}\sum_{j=1}^{N} s_j$, the mean fraction of
individuals who vote up, and noting that mean-field assumptions imply
$\gamma = P_\uparrow = 1 - P_\downarrow$, Eq. (2) may be written as

$$P_\downarrow = 1 - \gamma = \gamma\alpha + (1 - \gamma)\beta, \tag{3}$$

resulting in a steady-state expression for $\gamma$ (the "public opinion") as a
function of the conditional probabilities:

$$\gamma = \frac{1 - \beta}{1 + \alpha - \beta}. \tag{4}$$

In order to measure social influence, consider a population of $N$ individuals
characterized by $(\alpha, \beta)$, which is perturbed by applying a specific
value of the mean vote $\gamma_p$ to a fraction $p \in [0, 1]$ of the
population. The new mean vote of the population $\gamma_n$ is given by

$$\gamma_n = (1 - p)\left[\gamma_n(1 - \alpha) + (1 - \gamma_n)(1 - \beta)\right] + p\gamma_p. \tag{5}$$

The response function of the population $R(p)$ is defined by the fraction of
players who flip votes in response to the perturbation, i.e., outside the
perturbation group. An explicit expression for $R(p)$ is obtained using Eq. (5):

$$R(p) = \frac{\operatorname{sign}(\gamma_p - \gamma)(\gamma_n - \gamma) - p|\gamma_p - \gamma|}{(1 - p)|\gamma_p - \gamma|} = \frac{Ip}{1 - I(1 - p)}, \tag{6}$$



































where $I = \beta - \alpha$. Obviously, the population response function $R(p)$
is zero for $p = 0$ and for $\alpha = \beta$.
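
The chain from Eq. (4) to Eq. (6) can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and the normalisation of the response by $(1 - p)|\gamma_p - \gamma|$ is an assumption, chosen because it is the one that reproduces the closed form $Ip / [1 - I(1 - p)]$.

```python
def steady_state_vote(alpha: float, beta: float) -> float:
    """Eq. (4): mean up-vote fraction gamma of the unperturbed population."""
    return (1.0 - beta) / (1.0 + alpha - beta)


def perturbed_vote(alpha: float, beta: float, p: float, gamma_p: float) -> float:
    """Solve Eq. (5), which is linear in gamma_n, for the new mean vote."""
    herding = beta - alpha  # I = beta - alpha
    return ((1.0 - p) * (1.0 - beta) + p * gamma_p) / (1.0 - herding * (1.0 - p))


def response(alpha: float, beta: float, p: float, gamma_p: float) -> float:
    """Left-hand form of Eq. (6). Assumes p < 1 and gamma_p != gamma."""
    gamma = steady_state_vote(alpha, beta)
    gamma_n = perturbed_vote(alpha, beta, p, gamma_p)
    delta = gamma_p - gamma
    sign = 1.0 if delta > 0 else -1.0
    # Total shift minus the direct contribution of the perturbation group,
    # normalised (assumption) by the unperturbed share and the shift size.
    return (sign * (gamma_n - gamma) - p * abs(delta)) / ((1.0 - p) * abs(delta))


def response_closed_form(alpha: float, beta: float, p: float) -> float:
    """Right-hand form of Eq. (6): I p / (1 - I (1 - p))."""
    herding = beta - alpha
    return herding * p / (1.0 - herding * (1.0 - p))


if __name__ == "__main__":
    # Example values taken from row 1 of Table I.
    print(response(0.14, 0.53, 0.1, 1.0), response_closed_form(0.14, 0.53, 0.1))
```

Under these assumptions the two forms agree for any target vote $\gamma_p \neq \gamma$, which illustrates that the response depends on the population only through $I$.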

The herding parameter $I \in [-1, 1]$ is a measure of the social influence of
one individual on others, because $I = \beta - \alpha$ is the difference between
the conditional probabilities for correlated and anti-correlated behaviors; see
[1]. It is similar to the herding or percolation parameter $0 < c < 1$ of
Ref. [1]. However, since our definition of the herding parameter $I$ accounts
for both positive and negative social influence, it is better suited for
analyzing opinion dynamics in binary-vote communities.

The social susceptibility $\chi_s$ is defined as

$$\chi_s = \left.\frac{dR}{dp}\right|_{p=0} = \frac{I}{1 - I}, \tag{7}$$


and is the average size of a group whose members follow the change of opinion of
a single member (not including the initiating member itself). The size of the
perturbation group $p_{crit}$ required to convert a population $(\alpha, \beta)$
to the mean vote of the perturbed group $\gamma_p$ (including the polarized
cases $\gamma_p = 1, 0$) is obtained by substituting $\gamma_n = \gamma_p$ in
Eq. (5) and using Eq. (4):


7 IA 7 I 

where $\Delta\gamma = \gamma_p - \gamma$.
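
As a quick illustration of Eq. (7), the sketch below compares the closed form $\chi_s = I/(1 - I)$ with a finite-difference derivative of the response function $R(p) = Ip/[1 - I(1 - p)]$ at $p = 0$. The function names and the step size are illustrative assumptions, not part of the original analysis.

```python
def response(herding: float, p: float) -> float:
    """Closed form of the response function, R(p) = I p / (1 - I (1 - p))."""
    return herding * p / (1.0 - herding * (1.0 - p))


def susceptibility(alpha: float, beta: float) -> float:
    """Eq. (7): chi_s = I / (1 - I) with I = beta - alpha (singular at I = 1)."""
    herding = beta - alpha
    return herding / (1.0 - herding)


def susceptibility_numeric(alpha: float, beta: float, step: float = 1e-6) -> float:
    """Forward-difference estimate of dR/dp at p = 0 (step is an assumption)."""
    herding = beta - alpha
    return (response(herding, step) - response(herding, 0.0)) / step


if __name__ == "__main__":
    # Row 1 of Table I: alpha = 0.14, beta = 0.53.
    print(susceptibility(0.14, 0.53), susceptibility_numeric(0.14, 0.53))
```

For the rounded inputs of row 1 in Table I this gives a value of about 0.64, close to the 0.63 listed there; the small difference is expected if the tabulated value was computed from unrounded $\alpha$ and $\beta$.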

Conditional probabilities $(\alpha, \beta)$ define the herding $I$, which in
turn defines the social stability of the community. To calculate the conditional
probabilities $\alpha$ and $\beta$ as a function of the like ($\uparrow_i$) and
dislike ($\downarrow_i$) votes for comment $i$ of article $k$ (see Fig. 1), we
assume that the voter and commentator populations are equivalent and that the
number of comments and votes is large enough to apply the mean-field assumption.
Consequently, the probabilities for a commentator and a voter to be in favor of
the article subject $S$ are both equal to $\gamma$. Therefore, the comments
should consist of two groups with opposite opinions and relative sizes $\gamma$
and $1 - \gamma$, respectively.

According to the definition of the conditional probabilities (Eq. (1)), the
ratio between likes and all responses (likes and dislikes) for a positive
comment (to $S$) is $1 - \alpha$. However, this ratio for a negative comment (to
$S$) equals $\beta$, since expressing a like vote for a negative comment is
equivalent to expressing a dislike vote for the article subject $S$ commented
upon. Consequently, the probability of a dislike vote for a comment,
$P_{dislike}$, is different from the probability $P_\downarrow$ to dislike
subject $S$, as defined in Eq. (3). Therefore, the probability of a dislike vote
for a comment is:

$$P_{dislike} = \alpha\gamma + (1 - \beta)(1 - \gamma) = 2(1 - \gamma)\gamma. \tag{9}$$

The result is invariant under the transformation $\gamma \to 1 - \gamma$,
reflecting the uncertainty regarding the opinion of the Internet article itself.
Hence, the division of the comments into two groups with contrasting opinions
does not reveal the opinions themselves. Since $\chi_s$ is invariant under the
transformation $\gamma \to 1 - \gamma$, we arbitrarily chose $\gamma > 0.5$. An
interesting consequence of Eq. (9) is that $P_{dislike} < 0.5$, i.e., comments
cannot include only dislikes, because the community cannot dislike its own
opinion.

The calculation of $\alpha$, $\beta$, and $\gamma$ for the community proceeds
through iterations. First, all comments are sorted by their like vote fraction.
Then, at each step $n$, the comments are divided into two groups with ratio
$\gamma^n$ to $1 - \gamma^n$ according to their like vote fraction, where group
$L$ receives the fraction $\gamma^n$ of comments with the highest like vote
fraction and group $D$ receives all other comments. The population
characteristic parameters $\alpha^n$ and $\beta^n$ are then calculated as the
mean like vote fractions of the two groups:

$$1 - \alpha^n = \frac{1}{|L|}\sum_{i \in L} \frac{\uparrow_i}{\uparrow_i + \downarrow_i}, \qquad \beta^n = \frac{1}{|D|}\sum_{i \in D} \frac{\uparrow_i}{\uparrow_i + \downarrow_i}. \tag{10}$$

A new population mean vote $\gamma^{n+1}$ is calculated using the values of
$\alpha^n$ and $\beta^n$:

$$\gamma^{n+1} = \frac{1 - \beta^n}{1 + \alpha^n - \beta^n}. \tag{11}$$

The process is repeated until $\alpha^n$, $\beta^n$, and $\gamma^n$ converge.
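
The iteration of Eqs. (10) and (11) can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the authors' code: the function name, the default starting value `gamma_init`, the tolerance, the iteration cap, and the rule that keeps both groups non-empty are all hypothetical choices, and the group averages are taken as plain means over comments.

```python
def estimate_parameters(likes, dislikes, gamma_init=0.75, tol=1e-9, max_iter=1000):
    """Iterate Eqs. (10)-(11) on per-comment like/dislike counts.

    Every comment is assumed to have at least one vote, and the comments
    are assumed not to be unanimously liked (otherwise Eq. (11) is singular).
    Returns (alpha, beta, gamma).
    """
    # Sort comments by like vote fraction, highest first.
    fractions = sorted(
        (up / (up + down) for up, down in zip(likes, dislikes)), reverse=True
    )
    n = len(fractions)
    gamma = gamma_init
    for _ in range(max_iter):
        # Group L: the top gamma share of comments; keep both groups non-empty.
        n_top = min(max(round(gamma * n), 1), n - 1)
        group_l, group_d = fractions[:n_top], fractions[n_top:]
        alpha = 1.0 - sum(group_l) / len(group_l)   # Eq. (10), group L
        beta = sum(group_d) / len(group_d)          # Eq. (10), group D
        new_gamma = (1.0 - beta) / (1.0 + alpha - beta)  # Eq. (11)
        converged = abs(new_gamma - gamma) < tol
        gamma = new_gamma
        if converged:
            break
    return alpha, beta, gamma


if __name__ == "__main__":
    # Synthetic data: eight comments liked 9:1 and two comments split 5:5.
    print(estimate_parameters([9] * 8 + [5] * 2, [1] * 8 + [5] * 2))
```

Because the group boundary moves in whole comments, the split can stop changing before $\gamma$ settles to many digits; the iteration cap guards against the rare case where the boundary oscillates between two neighbouring positions.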


TABLE I: The results of the social susceptibility calculation for 12 CNN
articles from different dates that cover three different events. For each
article, the values of the conditional probabilities $(\alpha, \beta)$ and the
social susceptibility $\chi_s$ were calculated.

No.  Article's topic        Publish date (DD/MM/YY)  $\alpha$  $\beta$  $\gamma$  $\chi_s$
1    Zimmerman Trial        24/06/13                 0.14      0.53     0.77      0.63
2    Zimmerman Trial        05/07/13                 0.12      0.38     0.84      0.34
3    Zimmerman Trial        12/07/13                 0.08      0.47     0.87      0.64
4    Zimmerman Trial        13/07/13                 0.04      0.57     0.91      1.10
5    Zimmerman Trial        17/07/13                 0.04      0.67     0.89      1.70
6    Zimmerman Trial        25/07/13                 0.06      0.77     0.79      2.39
7    Iran Nuclear Program   25/10/13                 0.16      0.58     0.72      0.74
8    Iran Nuclear Program   23/11/13                 0.15      0.56     0.75      0.70
9    Iran Nuclear Program   24/11/13                 0.15      0.59     0.73      0.78
10   US Govt. Shutdown      01/10/13                 0.16      0.51     0.75      0.53
11   US Govt. Shutdown      02/10/13                 0.13      0.51     0.79      0.63
12   US Govt. Shutdown      02/10/13                 0.09      0.48     0.85      0.62


The formalism for the analysis of social influence presented above is applied to
news articles published on the CNN website that discuss three different topics.
The first story includes six articles, published between June 24th and July
25th, 2013, covering the George Zimmerman trial [16-21]. These articles cover
the legal proceedings, the verdict, and the post-verdict jurors' opinions about
the trial. The second story includes three articles, published between October
25th and November 25th, 2013, covering the negotiations and signing of the
Geneva interim agreement on the Iranian nuclear program [22-24]. The third story
includes three articles, published on October 1st and 2nd, 2013, covering the US
federal government shutdown of that year [25-27]. These articles cover the first
day of the shutdown and the White House's failing efforts to end it. The results
of these analyses are presented in Table I and in Fig. 3.

FIG. 3: Social susceptibility $\chi_s$ as a function of the conditional
probabilities $(\alpha, \beta)$, together with the states of different Internet
communities according to the analysis of CNN website articles (numbered dots).
The numbers correspond to the cases of Table I, which cover the Zimmerman trial,
the Iran Nuclear Program agreement, and the US Government shutdown.

All three cases exhibit continuous dynamics in $(\alpha, \beta)$ space, as shown
in Fig. 3. This result is interesting considering that the analysis is applied
to different articles, covering different stories, over periods spanning from
days to months. It indicates a slow change of opinions in the community.

In the Iran Nuclear Program and US Government shutdown cases, the population's
characteristic parameters $(\alpha, \beta)$ remain almost constant, although
they correspond to different CNN articles and, in the case of the Iran Nuclear
Program, span one month. This result may also indicate the absence of special
events during the observation period.

The social susceptibility level in the Zimmerman trial case changes near the
verdict announcement. In the period prior to the verdict day (points 1-3 in
Fig. 3), the level of social susceptibility in the population remains almost
constant and similar to the social susceptibility in the other cases (i.e.,
$\chi_s \approx 0.5$), despite the changes in $\alpha$ and $\beta$. From the
verdict day on (points 4-6), the social susceptibility in the community grows
rapidly and the population approaches the singular point
$(\alpha, \beta) \to (0, 1)$. It is beyond the scope of this work to interpret
social phenomena, though the results demonstrate that our method makes it
possible to observe the otherwise hidden herding level in a community together
with its response to social triggers.

The limitations of our work include the absence of an external force, e.g.,
government control, and the lack of interaction topology constraints, such as
the prevalence of nearest-neighbor interactions. Omitting topological
constraints seems to be justified in Internet communities. The same is true
regarding forces that shape opinion or add weight to some opinion, such as
government control or mass media. We assume that the Internet is still a free
zone. The model can be extended to include such a force, though there is no
clear way to quantify it.

FIG. 4: The social susceptibility $\chi_s$ as a function of time. Social
susceptibility remains almost constant in the cases of the US Government
shutdown and the Iran Nuclear Program negotiations. This preservation of the
community state is surprising because there is no reason why the opinion of a
population should remain the same for short or for long periods. Even more
interesting, however, is that social susceptibility in the case of the Zimmerman
trial changes after the verdict is announced on July 13, 2013. A significant
social transition becomes possible after the announcement of the verdict.

Shortly after the data collection phase for this work was completed, the CNN
website changed its comments policy, and the dislike count per comment is no
longer displayed. This change made the CNN website articles and comments
unsuitable for the above comment analysis procedure, since the main assumption
underlying our model, i.e., that both like and dislike vote counts are available
to all individuals in the population, is no longer valid. This study
demonstrates the potential of like/dislike votes in estimating the social state
of a community and may contribute to the evolving formation of the Internet news
format.

To conclude, the tools developed here for analyzing social influence in Internet
communities reveal the previously hidden level of herding and social influence
in populations as a function of time. In addition, this work provides a measure
for the stability of public opinion in a community and for the size of a group
capable of causing a critical change in average opinion. The presented method
can be compared with other methods and can be extended to other fields such as
financial markets. Therefore, this work enables an intriguing comparison of the
herding in the same community calculated from different sources, such as
Internet news and financial markets.

I. SUPPLEMENTARY MATERIAL


Here we include the detailed procedure to obtain the social influence parameter
from Internet discussion data. The algorithm's input consists of two vectors
containing the numbers of likes and dislikes each comment received,
$\uparrow_i$ and $\downarrow_i$, respectively (see Fig. 1). The length of these
two vectors is the number of comments $N$, usually a few thousand. The output is
the resulting population parameters $\alpha$ and $\beta$ and their error
margins.

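The initial value $\gamma_{init}$ is obtained from the probability for a voter
to be in favor of a comment, $P_\uparrow^{com}$, which is a measurable parameter
given by the ratio of like votes to total votes:

$$P_\uparrow^{com} = \frac{\sum_i \uparrow_i}{\sum_i \uparrow_i + \sum_i \downarrow_i}. \tag{12}$$

Taking into account Eq. (9) and $\alpha = \beta = 1 - \gamma$:

$$P_\uparrow^{com} = \gamma_{init}^2 + (1 - \gamma_{init})^2. \tag{13}$$

The initial value $\gamma_{init}$ is always chosen to be $> 0.5$. Then one
proceeds: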
Then one proceeds: 


1. Initialization

(a) Choose a value for the threshold on the number of voters, $T$. Start with
$T = 10$.

(b) From now on, consider only comments above the threshold:
$\uparrow_i + \downarrow_i > T$.

(c) Define the initial value of the mean vote $\gamma$ by solving the equation

$$\frac{\sum_i \uparrow_i}{\sum_i \uparrow_i + \sum_i \downarrow_i} = \gamma^2 + (1 - \gamma)^2.$$

Take only the solution $\gamma > 0.5$.

2. Classification of comments

(a) Order the comments according to their like vote fraction,
$\uparrow_i / (\uparrow_i + \downarrow_i)$.

(b) Divide the comments into two groups with ratio $\gamma$ to $1 - \gamma$
according to their like vote fraction, i.e., for group $L$ take the fraction
$\gamma$ of comments with the highest like vote fraction and for group $D$ take
all other comments.

(c) Calculate the population characteristic parameters $\alpha$ and $\beta$:

$$1 - \alpha = \frac{1}{|L|}\sum_{i \in L} \frac{\uparrow_i}{\uparrow_i + \downarrow_i}, \qquad \beta = \frac{1}{|D|}\sum_{i \in D} \frac{\uparrow_i}{\uparrow_i + \downarrow_i}.$$

(d) Calculate the new population mean vote $\gamma$ using the values of
$\alpha$ and $\beta$:

$$\gamma = \frac{1 - \beta}{1 + \alpha - \beta}.$$

(e) Repeat stages (a)-(d) until the values of $\alpha$ and $\beta$ converge.

3. Analyzing

(a) Increase the threshold for the number of voters $T$ by 1, and repeat stages
1-2.

(b) End when the number of comments above the threshold, $N_T$, is less than 50.

(c) The resulting $\alpha$ and $\beta$ are the weighted means over all permitted
thresholds:

$$\alpha = \frac{\sum_T N_T \alpha_T}{\sum_T N_T}, \qquad \beta = \frac{\sum_T N_T \beta_T}{\sum_T N_T}.$$

(d) The resulting $\sigma_\alpha$ and $\sigma_\beta$ are the equivalent standard
deviations over all permitted thresholds.
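
Two pieces of the procedure above lend themselves to a short sketch: the initial mean vote of step 1(c) and the threshold averaging of step 3. The code below is illustrative only. The function names are hypothetical, and reading the "equivalent standard deviation" as a weighted standard deviation with the same weights $N_T$ is an assumption.

```python
import math


def initial_gamma(likes, dislikes):
    """Step 1(c): solve gamma^2 + (1 - gamma)^2 = P_like, keeping gamma >= 0.5."""
    p_like = sum(likes) / (sum(likes) + sum(dislikes))
    # The quadratic 2*gamma^2 - 2*gamma + (1 - p_like) = 0 has the roots
    # gamma = (1 +/- sqrt(2*p_like - 1)) / 2; the "+" root is the one >= 0.5.
    discriminant = 2.0 * p_like - 1.0
    if discriminant < 0:
        raise ValueError("no real solution: overall like fraction is below 0.5")
    return 0.5 * (1.0 + math.sqrt(discriminant))


def weighted_mean_and_std(values, counts):
    """Step 3(c)-(d): mean and spread over thresholds, weighted by N_T.

    `values` holds the per-threshold estimates (alpha_T or beta_T) and
    `counts` the number of comments N_T above each threshold.
    """
    total = sum(counts)
    mean = sum(n * v for v, n in zip(values, counts)) / total
    variance = sum(n * (v - mean) ** 2 for v, n in zip(values, counts)) / total
    return mean, math.sqrt(variance)


if __name__ == "__main__":
    print(initial_gamma([9, 9, 5], [1, 1, 5]))
    print(weighted_mean_and_std([0.1, 0.2], [300, 100]))
```

Weighting by $N_T$ gives low thresholds, which retain the most comments, the largest say in the final estimate, while the spread across thresholds serves as the error margin.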


FIG. 5: Like vote fraction distribution for the comments of one CNN article
(point 4 in the main article), for a threshold value of 10. The black line,
which is determined by $\gamma$, divides the comments into two groups, in favor
of and against the subject $S$. The mean like vote fractions of the against and
in-favor groups equal $\beta$ and $1 - \alpha$, respectively.


Fig. 5 presents the like vote fraction distribution for the comments of the CNN
article announcing the not-guilty verdict in the Zimmerman trial (point 4 in the
article), for $T = 10$. This presentation illustrates the comment classification
procedure and the way the population parameters $\alpha$ and $\beta$ are
extracted. For the sensitivity of the model to the value of the threshold $T$,
see Fig. 6.

FIG. 6: The sensitivity of the resulting parameters to the threshold value $T$
(horizontal axis: threshold on the number of voters per comment). The black line
represents the number of comments above the threshold and the blue, red, and
green lines represent $\alpha$, $\beta$, and $\gamma$, respectively.






































